Overview

Dataset statistics

Number of variables: 13
Number of observations: 15411
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 162
Duplicate rows (%): 1.1%
Total size in memory: 1.6 MiB
Average record size in memory: 112.0 B

Variable types

Text: 2
Categorical: 4
Numeric: 7

Alerts

Dataset has 162 (1.1%) duplicate rows [Duplicates]
brand is highly overall correlated with engine and 3 other fields [High correlation]
engine is highly overall correlated with brand and 2 other fields [High correlation]
km_driven is highly overall correlated with vehicle_age [High correlation]
max_power is highly overall correlated with brand and 4 other fields [High correlation]
mileage is highly overall correlated with max_power [High correlation]
selling_price is highly overall correlated with brand and 2 other fields [High correlation]
transmission_type is highly overall correlated with brand and 1 other field [High correlation]
vehicle_age is highly overall correlated with km_driven [High correlation]
fuel_type is highly imbalanced (50.6%) [Imbalance]
km_driven is highly skewed (γ1 = 28.17271087) [Skewed]

Reproduction

Analysis started: 2025-02-18 12:49:42.349404
Analysis finished: 2025-02-18 12:49:50.134020
Duration: 7.78 seconds
Software version: ydata-profiling v4.12.2
Download configuration: config.json
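
A report of this kind can be regenerated from the source table with ydata-profiling. The following is a minimal sketch, not the configuration actually used here: the function name `build_report`, the input path, and the output file name are assumptions, and the third-party imports are deferred so the module loads even where the packages are not installed.

```python
from pathlib import Path


def build_report(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily: pandas and ydata-profiling are third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Overview")
    profile.to_file(html_path)
    return Path(html_path)

# Example call (hypothetical input file):
# build_report("cars.csv")
```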

Variables

car_name
Text

Distinct: 121
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 240.8 KiB

Length

Max length: 22
Median length: 19
Mean length: 12.612809
Min length: 5

Characters and Unicode

Total characters: 194376
Distinct characters: 62
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 0.1%

Sample

1st row: Maruti Alto
2nd row: Hyundai Grand
3rd row: Hyundai i20
4th row: Maruti Alto
5th row: Ford Ecosport

Value | Count | Frequency (%)
maruti | 4992 | 15.4%
hyundai | 2982 | 9.2%
swift | 1671 | 5.1%
honda | 1485 | 4.6%
mahindra | 1011 | 3.1%
dzire | 913 | 2.8%
i20 | 906 | 2.8%
toyota | 793 | 2.4%
ford | 790 | 2.4%
alto | 778 | 2.4%
Other values (142) | 16182 | 49.8%

Most occurring characters

Value | Count | Frequency (%)
a | 21446 | 11.0%
(space) | 17092 | 8.8%
i | 16613 | 8.5%
r | 12836 | 6.6%
t | 12474 | 6.4%
o | 11542 | 5.9%
n | 11422 | 5.9%
u | 9577 | 4.9%
d | 8077 | 4.2%
e | 7444 | 3.8%
Other values (52) | 65853 | 33.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 137070 | 70.5%
Uppercase Letter | 34981 | 18.0%
Space Separator | 17092 | 8.8%
Decimal Number | 4413 | 2.3%
Dash Punctuation | 820 | 0.4%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
M | 6855 | 19.6%
H | 4548 | 13.0%
S | 2721 | 7.8%
C | 2345 | 6.7%
V | 2316 | 6.6%
R | 1748 | 5.0%
D | 1614 | 4.6%
A | 1575 | 4.5%
W | 1565 | 4.5%
T | 1493 | 4.3%
Other values (16) | 8201 | 23.4%

Lowercase Letter
Value | Count | Frequency (%)
a | 21446 | 15.6%
i | 16613 | 12.1%
r | 12836 | 9.4%
t | 12474 | 9.1%
o | 11542 | 8.4%
n | 11422 | 8.3%
u | 9577 | 7.0%
d | 8077 | 5.9%
e | 7444 | 5.4%
y | 4621 | 3.4%
Other values (14) | 21018 | 15.3%

Decimal Number
Value | Count | Frequency (%)
0 | 2073 | 47.0%
2 | 906 | 20.5%
1 | 491 | 11.1%
5 | 473 | 10.7%
3 | 190 | 4.3%
4 | 112 | 2.5%
6 | 94 | 2.1%
7 | 60 | 1.4%
9 | 8 | 0.2%
8 | 6 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 17092 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 820 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 172051 | 88.5%
Common | 22325 | 11.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 21446 | 12.5%
i | 16613 | 9.7%
r | 12836 | 7.5%
t | 12474 | 7.3%
o | 11542 | 6.7%
n | 11422 | 6.6%
u | 9577 | 5.6%
d | 8077 | 4.7%
e | 7444 | 4.3%
M | 6855 | 4.0%
Other values (40) | 53765 | 31.2%

Common
Value | Count | Frequency (%)
(space) | 17092 | 76.6%
0 | 2073 | 9.3%
2 | 906 | 4.1%
- | 820 | 3.7%
1 | 491 | 2.2%
5 | 473 | 2.1%
3 | 190 | 0.9%
4 | 112 | 0.5%
6 | 94 | 0.4%
7 | 60 | 0.3%
Other values (2) | 14 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 194376 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 21446 | 11.0%
(space) | 17092 | 8.8%
i | 16613 | 8.5%
r | 12836 | 6.6%
t | 12474 | 6.4%
o | 11542 | 5.9%
n | 11422 | 5.9%
u | 9577 | 4.9%
d | 8077 | 4.2%
e | 7444 | 3.8%
Other values (52) | 65853 | 33.9%

brand
Categorical

High correlation 

Distinct: 32
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 240.8 KiB

Maruti: 4992
Hyundai: 2982
Honda: 1485
Mahindra: 1011
Toyota: 793
Other values (27): 4148

Length

Max length: 13
Median length: 12
Mean length: 6.2812277
Min length: 2

Characters and Unicode

Total characters: 96800
Distinct characters: 43
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: Maruti
2nd row: Hyundai
3rd row: Hyundai
4th row: Maruti
5th row: Ford

Common Values

Value | Count | Frequency (%)
Maruti | 4992 | 32.4%
Hyundai | 2982 | 19.3%
Honda | 1485 | 9.6%
Mahindra | 1011 | 6.6%
Toyota | 793 | 5.1%
Ford | 790 | 5.1%
Volkswagen | 620 | 4.0%
Renault | 536 | 3.5%
BMW | 439 | 2.8%
Tata | 430 | 2.8%
Other values (22) | 1333 | 8.6%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
maruti | 4992 | 32.3%
hyundai | 2982 | 19.3%
honda | 1485 | 9.6%
mahindra | 1011 | 6.5%
toyota | 793 | 5.1%
ford | 790 | 5.1%
volkswagen | 620 | 4.0%
renault | 536 | 3.5%
bmw | 439 | 2.8%
tata | 430 | 2.8%
Other values (22) | 1384 | 9.0%

Most occurring characters

Value | Count | Frequency (%)
a | 15011 | 15.5%
i | 9257 | 9.6%
u | 8957 | 9.3%
r | 7268 | 7.5%
n | 7223 | 7.5%
d | 7183 | 7.4%
t | 6926 | 7.2%
M | 6819 | 7.0%
o | 4930 | 5.1%
H | 4467 | 4.6%
Other values (33) | 18759 | 19.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 79702 | 82.3%
Uppercase Letter | 16708 | 17.3%
Dash Punctuation | 339 | 0.4%
Space Separator | 51 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 15011 | 18.8%
i | 9257 | 11.6%
u | 8957 | 11.2%
r | 7268 | 9.1%
n | 7223 | 9.1%
d | 7183 | 9.0%
t | 6926 | 8.7%
o | 4930 | 6.2%
y | 3779 | 4.7%
e | 2682 | 3.4%
Other values (11) | 6486 | 8.1%

Uppercase Letter
Value | Count | Frequency (%)
M | 6819 | 40.8%
H | 4467 | 26.7%
T | 1223 | 7.3%
F | 792 | 4.7%
B | 779 | 4.7%
V | 640 | 3.8%
R | 589 | 3.5%
W | 439 | 2.6%
S | 336 | 2.0%
A | 193 | 1.2%
Other values (10) | 431 | 2.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 339 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 51 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 96410 | 99.6%
Common | 390 | 0.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 15011 | 15.6%
i | 9257 | 9.6%
u | 8957 | 9.3%
r | 7268 | 7.5%
n | 7223 | 7.5%
d | 7183 | 7.5%
t | 6926 | 7.2%
M | 6819 | 7.1%
o | 4930 | 5.1%
H | 4467 | 4.6%
Other values (31) | 18369 | 19.1%

Common
Value | Count | Frequency (%)
- | 339 | 86.9%
(space) | 51 | 13.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 96800 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 15011 | 15.5%
i | 9257 | 9.6%
u | 8957 | 9.3%
r | 7268 | 7.5%
n | 7223 | 7.5%
d | 7183 | 7.4%
t | 6926 | 7.2%
M | 6819 | 7.0%
o | 4930 | 5.1%
H | 4467 | 4.6%
Other values (33) | 18759 | 19.4%

model
Text

Distinct: 120
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 240.8 KiB

Length

Max length: 12
Median length: 9
Mean length: 5.3315813
Min length: 1

Characters and Unicode

Total characters: 82165
Distinct characters: 62
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 0.1%

Sample

1st row: Alto
2nd row: Grand
3rd row: i20
4th row: Alto
5th row: Ecosport

Value | Count | Frequency (%)
swift | 1671 | 9.8%
dzire | 913 | 5.4%
i20 | 906 | 5.3%
alto | 778 | 4.6%
city | 757 | 4.4%
wagon | 717 | 4.2%
r | 717 | 4.2%
grand | 580 | 3.4%
innova | 545 | 3.2%
verna | 492 | 2.9%
Other values (111) | 8965 | 52.6%

Most occurring characters

Value | Count | Frequency (%)
i | 7356 | 9.0%
o | 6612 | 8.0%
a | 6435 | 7.8%
r | 5568 | 6.8%
t | 5548 | 6.8%
e | 4762 | 5.8%
n | 4199 | 5.1%
l | 2415 | 2.9%
S | 2385 | 2.9%
C | 2345 | 2.9%
Other values (52) | 34540 | 42.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 57368 | 69.8%
Uppercase Letter | 18273 | 22.2%
Decimal Number | 4413 | 5.4%
Space Separator | 1630 | 2.0%
Dash Punctuation | 481 | 0.6%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 2385 | 13.1%
C | 2345 | 12.8%
V | 1676 | 9.2%
D | 1444 | 7.9%
A | 1382 | 7.6%
R | 1159 | 6.3%
W | 1126 | 6.2%
E | 1077 | 5.9%
I | 947 | 5.2%
G | 810 | 4.4%
Other values (16) | 3922 | 21.5%

Lowercase Letter
Value | Count | Frequency (%)
i | 7356 | 12.8%
o | 6612 | 11.5%
a | 6435 | 11.2%
r | 5568 | 9.7%
t | 5548 | 9.7%
e | 4762 | 8.3%
n | 4199 | 7.3%
l | 2415 | 4.2%
z | 2028 | 3.5%
f | 1771 | 3.1%
Other values (14) | 10674 | 18.6%

Decimal Number
Value | Count | Frequency (%)
0 | 2073 | 47.0%
2 | 906 | 20.5%
1 | 491 | 11.1%
5 | 473 | 10.7%
3 | 190 | 4.3%
4 | 112 | 2.5%
6 | 94 | 2.1%
7 | 60 | 1.4%
9 | 8 | 0.2%
8 | 6 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 1630 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 481 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 75641 | 92.1%
Common | 6524 | 7.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 7356 | 9.7%
o | 6612 | 8.7%
a | 6435 | 8.5%
r | 5568 | 7.4%
t | 5548 | 7.3%
e | 4762 | 6.3%
n | 4199 | 5.6%
l | 2415 | 3.2%
S | 2385 | 3.2%
C | 2345 | 3.1%
Other values (40) | 28016 | 37.0%

Common
Value | Count | Frequency (%)
0 | 2073 | 31.8%
(space) | 1630 | 25.0%
2 | 906 | 13.9%
1 | 491 | 7.5%
- | 481 | 7.4%
5 | 473 | 7.3%
3 | 190 | 2.9%
4 | 112 | 1.7%
6 | 94 | 1.4%
7 | 60 | 0.9%
Other values (2) | 14 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 82165 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 7356 | 9.0%
o | 6612 | 8.0%
a | 6435 | 7.8%
r | 5568 | 6.8%
t | 5548 | 6.8%
e | 4762 | 5.8%
n | 4199 | 5.1%
l | 2415 | 2.9%
S | 2385 | 2.9%
C | 2345 | 2.9%
Other values (52) | 34540 | 42.0%

vehicle_age
Real number (ℝ)

High correlation 

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.0363377
Minimum: 0
Maximum: 29
Zeros: 5
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 240.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 4
median: 6
Q3: 8
95-th percentile: 12
Maximum: 29
Range: 29
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.0132915
Coefficient of variation (CV): 0.499192
Kurtosis: 0.76006867
Mean: 6.0363377
Median Absolute Deviation (MAD): 2
Skewness: 0.83371202
Sum: 93026
Variance: 9.0799254
Monotonicity: Not monotonic
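
Several of the entries above are derived from others, which gives a quick consistency check on the numbers. The sketch below assumes the usual definitions (range = maximum minus minimum, IQR = Q3 minus Q1, CV = standard deviation divided by mean, variance = standard deviation squared); the function name `derived_stats` is illustrative and not part of the report.

```python
def derived_stats(minimum, maximum, q1, q3, mean, std):
    """Recompute the derived entries of a numeric summary."""
    return {
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "cv": std / mean,
        "variance": std ** 2,
    }


# Inputs copied from the vehicle_age summary.
stats = derived_stats(minimum=0, maximum=29, q1=4, q3=8,
                      mean=6.0363377, std=3.0132915)

for name, value in stats.items():
    print(f"{name}: {value:.6f}")
# Reported values for comparison: range 29, IQR 4,
# CV 0.499192, variance 9.0799254.
```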
[Figure: Histogram with fixed size bins (bins=24)]

Common Values

Value | Count | Frequency (%)
4 | 2252 | 14.6%
5 | 2117 | 13.7%
3 | 1926 | 12.5%
6 | 1924 | 12.5%
7 | 1438 | 9.3%
8 | 1282 | 8.3%
2 | 1145 | 7.4%
9 | 1027 | 6.7%
10 | 710 | 4.6%
11 | 551 | 3.6%
Other values (14) | 1039 | 6.7%

Minimum values

Value | Count | Frequency (%)
0 | 5 | < 0.1%
1 | 221 | 1.4%
2 | 1145 | 7.4%
3 | 1926 | 12.5%
4 | 2252 | 14.6%
5 | 2117 | 13.7%
6 | 1924 | 12.5%
7 | 1438 | 9.3%
8 | 1282 | 8.3%
9 | 1027 | 6.7%

Maximum values

Value | Count | Frequency (%)
29 | 1 | < 0.1%
25 | 1 | < 0.1%
22 | 1 | < 0.1%
21 | 3 | < 0.1%
19 | 5 | < 0.1%
18 | 11 | 0.1%
17 | 17 | 0.1%
16 | 25 | 0.2%
15 | 90 | 0.6%
14 | 129 | 0.8%

km_driven
Real number (ℝ)

High correlation  Skewed 

Distinct: 3688
Distinct (%): 23.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55616.481
Minimum: 100
Maximum: 3800000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 240.8 KiB

Quantile statistics

Minimum: 100
5-th percentile: 10000
Q1: 30000
median: 50000
Q3: 70000
95-th percentile: 120000
Maximum: 3800000
Range: 3799900
Interquartile range (IQR): 40000

Descriptive statistics

Standard deviation: 51618.548
Coefficient of variation (CV): 0.92811605
Kurtosis: 1846.5268
Mean: 55616.481
Median Absolute Deviation (MAD): 20000
Skewness: 28.172711
Sum: 8.5710558 × 10^8
Variance: 2.6644745 × 10^9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
50000 | 553 | 3.6%
40000 | 477 | 3.1%
70000 | 466 | 3.0%
60000 | 449 | 2.9%
30000 | 363 | 2.4%
80000 | 352 | 2.3%
35000 | 334 | 2.2%
25000 | 323 | 2.1%
20000 | 298 | 1.9%
120000 | 293 | 1.9%
Other values (3678) | 11503 | 74.6%

Minimum values

Value | Count | Frequency (%)
100 | 1 | < 0.1%
581 | 1 | < 0.1%
1000 | 10 | 0.1%
1001 | 3 | < 0.1%
1005 | 1 | < 0.1%
1010 | 1 | < 0.1%
1041 | 1 | < 0.1%
1100 | 1 | < 0.1%
1198 | 1 | < 0.1%
1200 | 6 | < 0.1%

Maximum values

Value | Count | Frequency (%)
3800000 | 1 | < 0.1%
1325000 | 1 | < 0.1%
950000 | 1 | < 0.1%
850000 | 1 | < 0.1%
830000 | 1 | < 0.1%
825000 | 1 | < 0.1%
820000 | 1 | < 0.1%
720000 | 1 | < 0.1%
675000 | 1 | < 0.1%
590000 | 1 | < 0.1%
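
The skew alert for km_driven comes from a handful of very large readings. One common way to flag such values, shown here only as a hedged sketch and not as something this report computes, is Tukey's rule: mark anything above Q3 + 1.5 × IQR. The quartiles and the ten largest values are copied from the km_driven section; the function name `upper_fence` is illustrative.

```python
def upper_fence(q1: float, q3: float, k: float = 1.5) -> float:
    """Tukey's upper fence: Q3 + k * IQR."""
    return q3 + k * (q3 - q1)


# Quartiles and the ten largest km_driven values listed in the report.
fence = upper_fence(q1=30000, q3=70000)
largest = [3800000, 1325000, 950000, 850000, 830000,
           825000, 820000, 720000, 675000, 590000]
flagged = [v for v in largest if v > fence]

print(fence)         # 130000.0
print(len(flagged))  # 10
```

Whether such rows are data-entry errors or genuine high-mileage vehicles cannot be decided from the summary alone.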

seller_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 240.8 KiB

Dealer: 9539
Individual: 5699
Trustmark Dealer: 173

Length

Max length: 16
Median length: 6
Mean length: 7.5914606
Min length: 6

Characters and Unicode

Total characters: 116992
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Individual
2nd row: Individual
3rd row: Individual
4th row: Individual
5th row: Dealer

Common Values

Value | Count | Frequency (%)
Dealer | 9539 | 61.9%
Individual | 5699 | 37.0%
Trustmark Dealer | 173 | 1.1%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Value | Count | Frequency (%)
dealer | 9712 | 62.3%
individual | 5699 | 36.6%
trustmark | 173 | 1.1%

Most occurring characters

Value | Count | Frequency (%)
e | 19424 | 16.6%
a | 15584 | 13.3%
l | 15411 | 13.2%
i | 11398 | 9.7%
d | 11398 | 9.7%
r | 10058 | 8.6%
D | 9712 | 8.3%
u | 5872 | 5.0%
n | 5699 | 4.9%
I | 5699 | 4.9%
Other values (7) | 6737 | 5.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 101235 | 86.5%
Uppercase Letter | 15584 | 13.3%
Space Separator | 173 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 19424 | 19.2%
a | 15584 | 15.4%
l | 15411 | 15.2%
i | 11398 | 11.3%
d | 11398 | 11.3%
r | 10058 | 9.9%
u | 5872 | 5.8%
n | 5699 | 5.6%
v | 5699 | 5.6%
s | 173 | 0.2%
Other values (3) | 519 | 0.5%

Uppercase Letter
Value | Count | Frequency (%)
D | 9712 | 62.3%
I | 5699 | 36.6%
T | 173 | 1.1%

Space Separator
Value | Count | Frequency (%)
(space) | 173 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 116819 | 99.9%
Common | 173 | 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 19424 | 16.6%
a | 15584 | 13.3%
l | 15411 | 13.2%
i | 11398 | 9.8%
d | 11398 | 9.8%
r | 10058 | 8.6%
D | 9712 | 8.3%
u | 5872 | 5.0%
n | 5699 | 4.9%
I | 5699 | 4.9%
Other values (6) | 6564 | 5.6%

Common
Value | Count | Frequency (%)
(space) | 173 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 116992 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 19424 | 16.6%
a | 15584 | 13.3%
l | 15411 | 13.2%
i | 11398 | 9.7%
d | 11398 | 9.7%
r | 10058 | 8.6%
D | 9712 | 8.3%
u | 5872 | 5.0%
n | 5699 | 4.9%
I | 5699 | 4.9%
Other values (7) | 6737 | 5.8%

fuel_type
Categorical

Imbalance 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 240.8 KiB

Petrol: 7643
Diesel: 7419
CNG: 301
LPG: 44
Electric: 4

Length

Max length: 8
Median length: 6
Mean length: 5.9333593
Min length: 3

Characters and Unicode

Total characters: 91439
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Petrol
2nd row: Petrol
3rd row: Petrol
4th row: Petrol
5th row: Diesel

Common Values

Value | Count | Frequency (%)
Petrol | 7643 | 49.6%
Diesel | 7419 | 48.1%
CNG | 301 | 2.0%
LPG | 44 | 0.3%
Electric | 4 | < 0.1%
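
The 50.6% figure in the Imbalance alert can be reproduced from the fuel_type counts. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value (the logarithm of the number of classes). That definition is an assumption that happens to match the reported number, and the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """1 - H(p) / log(k): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no split to measure
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts copied from the fuel_type Common Values table.
fuel_counts = {"Petrol": 7643, "Diesel": 7419, "CNG": 301, "LPG": 44, "Electric": 4}
score = imbalance_score(list(fuel_counts.values()))
print(f"{score:.1%}")  # 50.6%
```

Petrol and Diesel are almost evenly split, so the score is driven by the three rare classes, which together account for about 2.3% of rows.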

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Value | Count | Frequency (%)
petrol | 7643 | 49.6%
diesel | 7419 | 48.1%
cng | 301 | 2.0%
lpg | 44 | 0.3%
electric | 4 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 22485 | 24.6%
l | 15066 | 16.5%
P | 7687 | 8.4%
t | 7647 | 8.4%
r | 7647 | 8.4%
o | 7643 | 8.4%
i | 7423 | 8.1%
D | 7419 | 8.1%
s | 7419 | 8.1%
G | 345 | 0.4%
Other values (5) | 658 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 75338 | 82.4%
Uppercase Letter | 16101 | 17.6%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 22485 | 29.8%
l | 15066 | 20.0%
t | 7647 | 10.2%
r | 7647 | 10.2%
o | 7643 | 10.1%
i | 7423 | 9.9%
s | 7419 | 9.8%
c | 8 | < 0.1%

Uppercase Letter
Value | Count | Frequency (%)
P | 7687 | 47.7%
D | 7419 | 46.1%
G | 345 | 2.1%
C | 301 | 1.9%
N | 301 | 1.9%
L | 44 | 0.3%
E | 4 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 91439 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 22485 | 24.6%
l | 15066 | 16.5%
P | 7687 | 8.4%
t | 7647 | 8.4%
r | 7647 | 8.4%
o | 7643 | 8.4%
i | 7423 | 8.1%
D | 7419 | 8.1%
s | 7419 | 8.1%
G | 345 | 0.4%
Other values (5) | 658 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 91439 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 22485 | 24.6%
l | 15066 | 16.5%
P | 7687 | 8.4%
t | 7647 | 8.4%
r | 7647 | 8.4%
o | 7643 | 8.4%
i | 7423 | 8.1%
D | 7419 | 8.1%
s | 7419 | 8.1%
G | 345 | 0.4%
Other values (5) | 658 | 0.7%

transmission_type
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 240.8 KiB

Manual: 12225
Automatic: 3186

Length

Max length: 9
Median length: 6
Mean length: 6.6202063
Min length: 6

Characters and Unicode

Total characters: 102024
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Manual
2nd row: Manual
3rd row: Manual
4th row: Manual
5th row: Manual

Common Values

Value | Count | Frequency (%)
Manual | 12225 | 79.3%
Automatic | 3186 | 20.7%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Value | Count | Frequency (%)
manual | 12225 | 79.3%
automatic | 3186 | 20.7%

Most occurring characters

Value | Count | Frequency (%)
a | 27636 | 27.1%
u | 15411 | 15.1%
M | 12225 | 12.0%
n | 12225 | 12.0%
l | 12225 | 12.0%
t | 6372 | 6.2%
A | 3186 | 3.1%
o | 3186 | 3.1%
m | 3186 | 3.1%
i | 3186 | 3.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 86613 | 84.9%
Uppercase Letter | 15411 | 15.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 27636 | 31.9%
u | 15411 | 17.8%
n | 12225 | 14.1%
l | 12225 | 14.1%
t | 6372 | 7.4%
o | 3186 | 3.7%
m | 3186 | 3.7%
i | 3186 | 3.7%
c | 3186 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
M | 12225 | 79.3%
A | 3186 | 20.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 102024 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 27636 | 27.1%
u | 15411 | 15.1%
M | 12225 | 12.0%
n | 12225 | 12.0%
l | 12225 | 12.0%
t | 6372 | 6.2%
A | 3186 | 3.1%
o | 3186 | 3.1%
m | 3186 | 3.1%
i | 3186 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 102024 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 27636 | 27.1%
u | 15411 | 15.1%
M | 12225 | 12.0%
n | 12225 | 12.0%
l | 12225 | 12.0%
t | 6372 | 6.2%
A | 3186 | 3.1%
o | 3186 | 3.1%
m | 3186 | 3.1%
i | 3186 | 3.1%

mileage
Real number (ℝ)

High correlation 

Distinct: 411
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.701151
Minimum: 4
Maximum: 33.54
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 240.8 KiB

Quantile statistics

Minimum: 4
5-th percentile: 12.8
Q1: 17
median: 19.67
Q3: 22.7
95-th percentile: 26.59
Maximum: 33.54
Range: 29.54
Interquartile range (IQR): 5.7

Descriptive statistics

Standard deviation: 4.1712646
Coefficient of variation (CV): 0.21172695
Kurtosis: -0.16752504
Mean: 19.701151
Median Absolute Deviation (MAD): 2.87
Skewness: 0.10496103
Sum: 303614.44
Variance: 17.399448
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
18.9 | 632 | 4.1%
18.6 | 404 | 2.6%
17 | 339 | 2.2%
24.3 | 292 | 1.9%
28.4 | 279 | 1.8%
20.51 | 224 | 1.5%
20.36 | 221 | 1.4%
22 | 201 | 1.3%
15.1 | 199 | 1.3%
19.7 | 197 | 1.3%
Other values (401) | 12423 | 80.6%

Minimum values

Value | Count | Frequency (%)
4 | 1 | < 0.1%
6 | 1 | < 0.1%
7.5 | 2 | < 0.1%
7.81 | 6 | < 0.1%
7.94 | 1 | < 0.1%
8.45 | 6 | < 0.1%
8.5 | 2 | < 0.1%
8.6 | 1 | < 0.1%
8.77 | 1 | < 0.1%
8.9 | 3 | < 0.1%

Maximum values

Value | Count | Frequency (%)
33.54 | 24 | 0.2%
33.44 | 14 | 0.1%
32.52 | 13 | 0.1%
32.26 | 4 | < 0.1%
31.79 | 16 | 0.1%
30.48 | 6 | < 0.1%
30.47 | 3 | < 0.1%
30.46 | 14 | 0.1%
28.4 | 279 | 1.8%
28.09 | 136 | 0.9%

engine
Real number (ℝ)

High correlation 

Distinct: 110
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1486.0578
Minimum: 793
Maximum: 6592
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 240.8 KiB

Quantile statistics

Minimum: 793
5-th percentile: 998
Q1: 1197
median: 1248
Q3: 1582
95-th percentile: 2523
Maximum: 6592
Range: 5799
Interquartile range (IQR): 385

Descriptive statistics

Standard deviation: 521.1067
Coefficient of variation (CV): 0.35066383
Kurtosis: 4.329217
Mean: 1486.0578
Median Absolute Deviation (MAD): 249
Skewness: 1.6664666
Sum: 22901636
Variance: 271552.19
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1197 | 2436 | 15.8%
1248 | 1668 | 10.8%
998 | 1174 | 7.6%
1498 | 1095 | 7.1%
2179 | 669 | 4.3%
1497 | 662 | 4.3%
796 | 501 | 3.3%
1199 | 469 | 3.0%
1396 | 465 | 3.0%
1198 | 437 | 2.8%
Other values (100) | 5835 | 37.9%

Minimum values

Value | Count | Frequency (%)
793 | 9 | 0.1%
796 | 501 | 3.3%
799 | 207 | 1.3%
998 | 1174 | 7.6%
999 | 228 | 1.5%
1047 | 46 | 0.3%
1061 | 76 | 0.5%
1086 | 288 | 1.9%
1120 | 93 | 0.6%
1186 | 48 | 0.3%

Maximum values

Value | Count | Frequency (%)
6592 | 1 | < 0.1%
5998 | 3 | < 0.1%
5461 | 4 | < 0.1%
4806 | 3 | < 0.1%
4663 | 1 | < 0.1%
4395 | 2 | < 0.1%
4367 | 3 | < 0.1%
4163 | 1 | < 0.1%
4134 | 3 | < 0.1%
3855 | 1 | < 0.1%

max_power
Real number (ℝ)

High correlation 

Distinct: 342
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.58825
Minimum: 38.4
Maximum: 626
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 240.8 KiB

Quantile statistics

Minimum: 38.4
5-th percentile: 58.16
Q1: 74
median: 88.5
Q3: 117.3
95-th percentile: 184
Maximum: 626
Range: 587.6
Interquartile range (IQR): 43.3

Descriptive statistics

Standard deviation: 42.972979
Coefficient of variation (CV): 0.42721667
Kurtosis: 11.881846
Mean: 100.58825
Median Absolute Deviation (MAD): 15.5
Skewness: 2.4851294
Sum: 1550165.6
Variance: 1846.6769
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
74 | 791 | 5.1%
88.5 | 589 | 3.8%
81.8 | 544 | 3.5%
98.6 | 424 | 2.8%
67.04 | 390 | 2.5%
67 | 338 | 2.2%
140 | 319 | 2.1%
81.86 | 305 | 2.0%
82 | 287 | 1.9%
47.3 | 286 | 1.9%
Other values (332) | 11138 | 72.3%

Minimum values

Value | Count | Frequency (%)
38.4 | 5 | < 0.1%
40.3 | 14 | 0.1%
46.3 | 195 | 1.3%
47 | 12 | 0.1%
47.3 | 286 | 1.9%
53 | 3 | < 0.1%
53.26 | 18 | 0.1%
53.3 | 132 | 0.9%
53.64 | 54 | 0.4%
57.5 | 19 | 0.1%

Maximum values

Value | Count | Frequency (%)
626 | 1 | < 0.1%
601 | 1 | < 0.1%
600 | 1 | < 0.1%
563 | 1 | < 0.1%
552 | 1 | < 0.1%
500 | 1 | < 0.1%
459 | 1 | < 0.1%
450 | 1 | < 0.1%
440 | 1 | < 0.1%
420 | 1 | < 0.1%

seats
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3254818
Minimum: 0
Maximum: 9
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 240.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 5
median: 5
Q3: 5
95-th percentile: 7
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80762843
Coefficient of variation (CV): 0.15165359
Kurtosis: 3.7018063
Mean: 5.3254818
Median Absolute Deviation (MAD): 0
Skewness: 2.0399825
Sum: 82071
Variance: 0.65226368
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=8)]

Common Values

Value | Count | Frequency (%)
5 | 12910 | 83.8%
7 | 1922 | 12.5%
8 | 311 | 2.0%
6 | 127 | 0.8%
4 | 77 | 0.5%
9 | 55 | 0.4%
2 | 7 | < 0.1%
0 | 2 | < 0.1%

Minimum values

Value | Count | Frequency (%)
0 | 2 | < 0.1%
2 | 7 | < 0.1%
4 | 77 | 0.5%
5 | 12910 | 83.8%
6 | 127 | 0.8%
7 | 1922 | 12.5%
8 | 311 | 2.0%
9 | 55 | 0.4%

Maximum values

Value | Count | Frequency (%)
9 | 55 | 0.4%
8 | 311 | 2.0%
7 | 1922 | 12.5%
6 | 127 | 0.8%
5 | 12910 | 83.8%
4 | 77 | 0.5%
2 | 7 | < 0.1%
0 | 2 | < 0.1%

selling_price
Real number (ℝ)

High correlation 

Distinct: 1086
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 774971.12
Minimum: 40000
Maximum: 39500000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 240.8 KiB

Quantile statistics

Minimum: 40000
5-th percentile: 215000
Q1: 385000
median: 556000
Q3: 825000
95-th percentile: 2050000
Maximum: 39500000
Range: 39460000
Interquartile range (IQR): 440000

Descriptive statistics

Standard deviation: 894128.36
Coefficient of variation (CV): 1.153757
Kurtosis: 281.98064
Mean: 774971.12
Median Absolute Deviation (MAD): 206000
Skewness: 10.047048
Sum: 1.194308 × 10^10
Variance: 7.9946553 × 10^11
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
450000 | 357 | 2.3%
550000 | 334 | 2.2%
650000 | 330 | 2.1%
350000 | 320 | 2.1%
500000 | 271 | 1.8%
400000 | 265 | 1.7%
600000 | 256 | 1.7%
750000 | 229 | 1.5%
300000 | 228 | 1.5%
525000 | 192 | 1.2%
Other values (1076) | 12629 | 81.9%

Minimum values

Value | Count | Frequency (%)
40000 | 1 | < 0.1%
45000 | 1 | < 0.1%
50000 | 3 | < 0.1%
55000 | 1 | < 0.1%
60000 | 4 | < 0.1%
62000 | 2 | < 0.1%
65000 | 5 | < 0.1%
70000 | 5 | < 0.1%
74000 | 1 | < 0.1%
75000 | 4 | < 0.1%

Maximum values

Value | Count | Frequency (%)
39500000 | 1 | < 0.1%
24200000 | 1 | < 0.1%
14500000 | 1 | < 0.1%
13000000 | 1 | < 0.1%
11100000 | 1 | < 0.1%
11000000 | 2 | < 0.1%
9200000 | 1 | < 0.1%
8500000 | 2 | < 0.1%
8250000 | 1 | < 0.1%
8200000 | 1 | < 0.1%

Interactions

[Figures: 49 interaction plots]

Correlations

(field) | brand | engine | fuel_type | km_driven | max_power | mileage | seats | seller_type | selling_price | transmission_type | vehicle_age
brand | 1.000 | 0.613 | 0.243 | 0.000 | 0.565 | 0.433 | 0.371 | 0.184 | 0.701 | 0.580 | 0.112
engine | 0.613 | 1.000 | 0.270 | 0.273 | 0.828 | -0.498 | 0.455 | 0.098 | 0.643 | 0.401 | 0.090
fuel_type | 0.243 | 0.270 | 1.000 | 0.000 | 0.184 | 0.323 | 0.182 | 0.077 | 0.030 | 0.085 | 0.111
km_driven | 0.000 | 0.273 | 0.000 | 1.000 | 0.096 | -0.097 | 0.198 | 0.000 | -0.166 | 0.010 | 0.576
max_power | 0.565 | 0.828 | 0.184 | 0.096 | 1.000 | -0.508 | 0.243 | 0.135 | 0.722 | 0.577 | -0.029
mileage | 0.433 | -0.498 | 0.323 | -0.097 | -0.508 | 1.000 | -0.426 | 0.093 | -0.279 | 0.305 | -0.232
seats | 0.371 | 0.455 | 0.182 | 0.198 | 0.243 | -0.426 | 1.000 | 0.063 | 0.266 | 0.123 | 0.024
seller_type | 0.184 | 0.098 | 0.077 | 0.000 | 0.135 | 0.093 | 0.063 | 1.000 | 0.045 | 0.204 | 0.093
selling_price | 0.701 | 0.643 | 0.030 | -0.166 | 0.722 | -0.279 | 0.266 | 0.045 | 1.000 | 0.233 | -0.477
transmission_type | 0.580 | 0.401 | 0.085 | 0.010 | 0.577 | 0.305 | 0.123 | 0.204 | 0.233 | 1.000 | 0.089
vehicle_age | 0.112 | 0.090 | 0.111 | 0.576 | -0.029 | -0.232 | 0.024 | 0.093 | -0.477 | 0.089 | 1.000
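
The High correlation alerts in the Overview can be cross-checked against this matrix. The sketch below assumes that a pair is flagged when the absolute coefficient is at least 0.5; that threshold is an assumption, chosen because it reproduces the field counts quoted in the alerts. The names `FIELDS`, `MATRIX`, and `high_correlations` are illustrative.

```python
FIELDS = ["brand", "engine", "fuel_type", "km_driven", "max_power", "mileage",
          "seats", "seller_type", "selling_price", "transmission_type",
          "vehicle_age"]

# Rows copied from the correlation matrix, in FIELDS order.
MATRIX = [
    [1.000, 0.613, 0.243, 0.000, 0.565, 0.433, 0.371, 0.184, 0.701, 0.580, 0.112],
    [0.613, 1.000, 0.270, 0.273, 0.828, -0.498, 0.455, 0.098, 0.643, 0.401, 0.090],
    [0.243, 0.270, 1.000, 0.000, 0.184, 0.323, 0.182, 0.077, 0.030, 0.085, 0.111],
    [0.000, 0.273, 0.000, 1.000, 0.096, -0.097, 0.198, 0.000, -0.166, 0.010, 0.576],
    [0.565, 0.828, 0.184, 0.096, 1.000, -0.508, 0.243, 0.135, 0.722, 0.577, -0.029],
    [0.433, -0.498, 0.323, -0.097, -0.508, 1.000, -0.426, 0.093, -0.279, 0.305, -0.232],
    [0.371, 0.455, 0.182, 0.198, 0.243, -0.426, 1.000, 0.063, 0.266, 0.123, 0.024],
    [0.184, 0.098, 0.077, 0.000, 0.135, 0.093, 0.063, 1.000, 0.045, 0.204, 0.093],
    [0.701, 0.643, 0.030, -0.166, 0.722, -0.279, 0.266, 0.045, 1.000, 0.233, -0.477],
    [0.580, 0.401, 0.085, 0.010, 0.577, 0.305, 0.123, 0.204, 0.233, 1.000, 0.089],
    [0.112, 0.090, 0.111, 0.576, -0.029, -0.232, 0.024, 0.093, -0.477, 0.089, 1.000],
]


def high_correlations(fields, matrix, threshold=0.5):
    """Map each field to the other fields with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(fields):
        partners = [fields[j] for j, value in enumerate(matrix[i])
                    if j != i and abs(value) >= threshold]
        if partners:
            result[name] = partners
    return result


high = high_correlations(FIELDS, MATRIX)
for name, partners in high.items():
    print(f"{name} -> {', '.join(partners)}")
```

Under this assumption eight fields are flagged, matching the eight High correlation alerts; engine and mileage (-0.498) fall just short of the cutoff.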

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

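(index) | car_name | brand | model | vehicle_age | km_driven | seller_type | fuel_type | transmission_type | mileage | engine | max_power | seats | selling_price
0 | Maruti Alto | Maruti | Alto | 9 | 120000 | Individual | Petrol | Manual | 19.70 | 796 | 46.30 | 5 | 120000
1 | Hyundai Grand | Hyundai | Grand | 5 | 20000 | Individual | Petrol | Manual | 18.90 | 1197 | 82.00 | 5 | 550000
2 | Hyundai i20 | Hyundai | i20 | 11 | 60000 | Individual | Petrol | Manual | 17.00 | 1197 | 80.00 | 5 | 215000
3 | Maruti Alto | Maruti | Alto | 9 | 37000 | Individual | Petrol | Manual | 20.92 | 998 | 67.10 | 5 | 226000
4 | Ford Ecosport | Ford | Ecosport | 6 | 30000 | Dealer | Diesel | Manual | 22.77 | 1498 | 98.59 | 5 | 570000
5 | Maruti Wagon R | Maruti | Wagon R | 8 | 35000 | Individual | Petrol | Manual | 18.90 | 998 | 67.10 | 5 | 350000
6 | Hyundai i10 | Hyundai | i10 | 8 | 40000 | Dealer | Petrol | Manual | 20.36 | 1197 | 78.90 | 5 | 315000
7 | Maruti Wagon R | Maruti | Wagon R | 3 | 17512 | Dealer | Petrol | Manual | 20.51 | 998 | 67.04 | 5 | 410000
8 | Hyundai Venue | Hyundai | Venue | 2 | 20000 | Individual | Petrol | Automatic | 18.15 | 998 | 118.35 | 5 | 1050000
12 | Maruti Swift | Maruti | Swift | 4 | 28321 | Dealer | Petrol | Manual | 16.60 | 1197 | 85.00 | 5 | 511000

(index) | car_name | brand | model | vehicle_age | km_driven | seller_type | fuel_type | transmission_type | mileage | engine | max_power | seats | selling_price
19531 | Maruti Swift | Maruti | Swift | 3 | 25000 | Individual | Petrol | Automatic | 22.00 | 1197 | 81.80 | 5 | 590000
19533 | Honda Amaze | Honda | Amaze | 6 | 28000 | Dealer | Diesel | Manual | 25.80 | 1498 | 98.60 | 5 | 525000
19534 | Renault KWID | Renault | KWID | 2 | 2700 | Dealer | Petrol | Manual | 25.17 | 799 | 53.30 | 5 | 395000
19535 | Maruti Ertiga | Maruti | Ertiga | 5 | 56829 | Dealer | Diesel | Manual | 20.77 | 1248 | 88.80 | 7 | 895000
19536 | Hyundai Grand | Hyundai | Grand | 5 | 9229 | Dealer | Petrol | Manual | 18.90 | 1197 | 82.00 | 5 | 545000
19537 | Hyundai i10 | Hyundai | i10 | 9 | 10723 | Dealer | Petrol | Manual | 19.81 | 1086 | 68.05 | 5 | 250000
19540 | Maruti Ertiga | Maruti | Ertiga | 2 | 18000 | Dealer | Petrol | Manual | 17.50 | 1373 | 91.10 | 7 | 925000
19541 | Skoda Rapid | Skoda | Rapid | 6 | 67000 | Dealer | Diesel | Manual | 21.14 | 1498 | 103.52 | 5 | 425000
19542 | Mahindra XUV500 | Mahindra | XUV500 | 5 | 3800000 | Dealer | Diesel | Manual | 16.00 | 2179 | 140.00 | 7 | 1225000
19543 | Honda City | Honda | City | 2 | 13000 | Dealer | Petrol | Automatic | 18.00 | 1497 | 117.60 | 5 | 1200000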

Duplicate rows

Most frequently occurring

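(index) | car_name | brand | model | vehicle_age | km_driven | seller_type | fuel_type | transmission_type | mileage | engine | max_power | seats | selling_price | # duplicates
10 | Ford Endeavour | Ford | Endeavour | 3 | 48000 | Dealer | Diesel | Automatic | 10.91 | 3198 | 197.00 | 7 | 2899000 | 3
74 | Maruti Alto | Maruti | Alto | 3 | 20000 | Individual | Petrol | Manual | 24.70 | 796 | 47.30 | 5 | 310000 | 3
98 | Maruti Swift | Maruti | Swift | 8 | 80000 | Individual | Diesel | Manual | 22.90 | 1248 | 74.00 | 5 | 350000 | 3
155 | Toyota Camry | Toyota | Camry | 7 | 53000 | Dealer | Petrol | Automatic | 12.98 | 2494 | 178.30 | 5 | 1099000 | 3
157 | Volkswagen Polo | Volkswagen | Polo | 4 | 55000 | Dealer | Petrol | Manual | 16.20 | 1199 | 74.00 | 5 | 675000 | 3
0 | Audi A4 | Audi | A4 | 5 | 51000 | Dealer | Diesel | Automatic | 18.25 | 1968 | 187.74 | 5 | 1850000 | 2
1 | Audi A4 | Audi | A4 | 9 | 68370 | Dealer | Diesel | Automatic | 17.11 | 1968 | 174.33 | 5 | 1345000 | 2
2 | BMW 5 | BMW | 5 | 0 | 2000 | Dealer | Diesel | Automatic | 22.48 | 1995 | 187.74 | 5 | 4850000 | 2
3 | BMW 6 | BMW | 6 | 3 | 11000 | Dealer | Diesel | Automatic | 17.09 | 2993 | 261.40 | 4 | 5600000 | 2
4 | BMW 7 | BMW | 7 | 13 | 70000 | Dealer | Petrol | Automatic | 8.45 | 2979 | 321.00 | 5 | 750000 | 2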